Train your own R1 reasoning model with Unsloth

GRPO is now possible on consumer GPUs…

https://x.com/danielhanchen/status/1887564724071768529 Training (QLoRA) of 14B-class models is finally possible with 16GB of VRAM. Seriously? morisoba65536.icon

Training also works on Google Colab

Reportedly, with 48GB of VRAM even 70B-class models can be trained

The minimum requirement is 7GB of VRAM, and it has been confirmed to work with model sizes from 1.5B up
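As a rough sanity check on the VRAM figures above, here is a minimal sketch that estimates only the memory taken by the 4-bit quantized base weights, as in QLoRA. The function name `quantized_weight_gib` is made up for this note, and the 4 bits per parameter is an assumption that ignores quantization constants. LoRA adapters, optimizer state, activations, and the memory used for generation during GRPO are not counted, so actual usage will be higher than these numbers.

```python
def quantized_weight_gib(params_billions: float, bits_per_param: float = 4.0) -> float:
    """Approximate size of the quantized base weights alone, in GiB.

    Assumption: every parameter is stored in `bits_per_param` bits.
    Everything else (adapters, optimizer state, activations) is ignored.
    """
    total_bytes = params_billions * 1e9 * bits_per_param / 8
    return total_bytes / 2**30


# Model sizes mentioned in the notes above (in billions of parameters).
for size in (1.5, 14, 70):
    print(f"{size}B -> about {quantized_weight_gib(size):.1f} GiB of 4-bit weights")
```

Under these assumptions the 14B weights fit well within 16GB and the 70B weights within 48GB, which leaves the remaining VRAM for everything the estimate leaves out.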

https://x.com/UnslothAI/status/1892640995847901684 https://unsloth.ai/blog/grpo unsloth.icon Memory usage has been reduced further, so it now works from 5GB of VRAM

Q: How big a deal is this?

A: https://x.com/gclue_akira/status/1887760201669136825 Training that used to require 8xH100 can now be done on the free Google Colab or on something like a local RTX 4060 Ti 16GB (and in a reasonably practical amount of time)

It is probably using this